Efficient and Effective Querying by Image Content

In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, sketch, and shape of image objects and regions. Potential applications include medical ("Give me other images that contain a tumor with a texture like this one"), photo-journalism ("Give me images that have blue at the top and red at the bottom"), and many others in art, fashion, cataloging, retailing, and industry.

We describe a set of novel features and similarity measures allowing query by image content, together with the QBIC system we implemented. We demonstrate the effectiveness of our system with normalized precision and recall experiments on test databases containing over 1000 images and 1000 objects, populated from commercially available photo clip art images and from images of airplane silhouettes. We also present novel methods for efficient processing of QBIC types of queries, which consist of filtering and indexing steps. We are specifically addressing two problems: (a) non-Euclidean distance measures; and (b) high dimensionality of feature vectors. For the first problem, we introduce a new theorem that makes efficient filtering possible by bounding the non-Euclidean, full cross-term quadratic distance expression with a simple Euclidean distance. For the second, we illustrate how orthogonal transforms, such as the Karhunen-Loeve transform, can help reduce the dimensionality of the search space. Our methods are general and allow some "false alarms" ("false hits", or "false positives") but no false dismissals.
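The "false alarms but no false dismissals" guarantee can be illustrated with a minimal filter-and-refine sketch. It is not the method of the report: it assumes the feature vectors are already expressed in an orthonormal basis (for example, after a Karhunen-Loeve transform), so that the Euclidean distance over the first k coordinates can never exceed the full Euclidean distance. The function names, the sample vectors, the radius, and k are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(query, database, radius, k):
    """Filter on the first k coordinates, then refine with the full distance.

    Returns (candidates, hits) as lists of indices into `database`.
    """
    # Filter step: dropping coordinates only removes non-negative terms from
    # the sum of squares, so the truncated distance is a lower bound on the
    # full distance. Anything rejected here is a true non-match.
    candidates = [
        i for i, vec in enumerate(database)
        if euclidean(query[:k], vec[:k]) <= radius
    ]
    # Refine step: the full distance discards the false alarms that
    # slipped through the cheaper filter.
    hits = [i for i in candidates if euclidean(query, database[i]) <= radius]
    return candidates, hits


# Hypothetical 4-dimensional feature vectors, assumed to be already
# expressed in an orthonormal basis.
database = [
    [1.0, 2.0, 0.1, 0.0],  # close to the query in all coordinates
    [1.1, 2.1, 3.0, 0.0],  # close in the first two coordinates only
    [5.0, 5.0, 0.0, 0.0],  # far from the query
]
query = [1.0, 2.0, 0.0, 0.0]

candidates, hits = range_query(query, database, radius=0.5, k=2)
print(candidates)  # indices that pass the low-dimensional filter
print(hits)        # indices that also pass the full-distance check
```

In this sketch the second vector passes the filter but fails the full check, so it is a false alarm that the refine step removes. The third vector is rejected by the filter alone, which is safe because the filter distance is a lower bound.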

The resulting QBIC system offers effective retrieval using image content and, for large image databases, a significant speedup over straightforward indexing alternatives. The system is implemented in X/Motif and C running on an RS/6000.

By: C. Faloutsos, M. Flickner, W. Niblack, D. Petkovic, W. Equitz, R. Barber

Published in: RJ9453 in 1993

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
